CLIR Experiments at Maryland for TREC 2002: Evidence Combination for Arabic-English Retrieval
Abstract
The experiments reported in this paper focused on techniques for combining evidence in cross-language retrieval, searching Arabic documents using English queries. Evidence from multiple sources of translation knowledge was combined to estimate translation probabilities, and four techniques for estimating query-language term weights from document-language evidence were tried. A new technique that exploits translation probability information was found to outperform a comparable technique that does not use that information. Comparative results for three variants of Arabic “light” stemming are also presented. A simple variant of an existing stemming algorithm was found to yield significantly better retrieval effectiveness.
Similar resources
TREC-10 Experiments at University of Maryland CLIR and Video
University of Maryland researchers participated in both the Arabic-English Cross-Language Information Retrieval (CLIR) and Video tracks of TREC-10. In the CLIR track, our goal was to explore effective monolingual Arabic IR techniques and effective query translation from English to Arabic for cross-language IR. For the monolingual part, the use of the different index terms including words, s...
Building an Arabic Stemmer for Information Retrieval
In TREC 2002, the Berkeley group participated only in the English-Arabic cross-language retrieval (CLIR) track. One Arabic monolingual run and three English-Arabic cross-language runs were submitted. Our approach to cross-language retrieval was to translate the English topics into Arabic using online English-Arabic machine translation systems. The four official runs are named BKYMON, BKYCL...
Towards a New Standard Arabic Test Collection for Mono- and Cross-Language Information Retrieval
In this paper we propose a new standard Arabic test collection for mono- and cross-language Information Retrieval (CLIR). To do this, we exploit the “Hadith” texts and provide a portal for sampling and evaluating Hadith results listed in both Arabic and English versions. The new standard Arabic test collection, called “Kunuz”, will promote and restart the development of Arabic mono retrieva...
TREC-8 Experiments at Maryland: CLIR, QA and Routing
The University of Maryland team participated in four aspects of TREC-8: the ad hoc retrieval task, the main task in the cross-language retrieval (CLIR) track, the question answering track, and the routing task in the filtering track. The CLIR method was based on Pirkola’s method for Dictionary-based Query Translation, using freely available dictionaries. Broad-coverage parsing and rule-based ma...
TREC-8 Experiments at Maryland: CLIR, QA and Routing
The University of Maryland team participated in four aspects of TREC-8: the ad hoc retrieval task, the main task in the cross-language retrieval (CLIR) track, the question answering track, and the routing task in the filtering track. The CLIR method was based on Pirkola’s method for Dictionary-based Query Translation, using freely available dictionaries. Broad-coverage parsing and rule-based matching was us...